Synthesizing Normalized Faces from Facial Identity Features
We present a method for synthesizing a frontal, neutral-expression image of a
person's face given an input face photograph. This is achieved by learning to
generate facial landmarks and textures from features extracted from a
facial-recognition network. Unlike previous approaches, our encoding feature
vector is largely invariant to lighting, pose, and facial expression.
Exploiting this invariance, we train our decoder network using only frontal,
neutral-expression photographs. Since these photographs are well aligned, we
can decompose them into a sparse set of landmark points and aligned texture
maps. The decoder then predicts landmarks and textures independently and
combines them using a differentiable image warping operation. The resulting
images can be used for a number of applications, such as analyzing facial
attributes, exposure and white balance adjustment, or creating a 3-D avatar.
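The abstract's key mechanism is a differentiable image warping operation that combines predicted landmarks and textures. The paper does not spell out the warp here, but such layers are typically built on bilinear sampling, sketched below as a toy (the 4x4 texture and coordinates are illustrative, not from the paper):

```python
import numpy as np

def bilinear_sample(img, y, x):
    # Bilinear lookup at real-valued coordinates (y, x): the output is a
    # smooth function of (y, x), which is what lets gradients flow through
    # an image-warping layer during training.
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    dy, dx = y - y0, x - x0
    return ((1 - dy) * (1 - dx) * img[y0, x0]
            + (1 - dy) * dx * img[y0, x0 + 1]
            + dy * (1 - dx) * img[y0 + 1, x0]
            + dy * dx * img[y0 + 1, x0 + 1])

tex = np.arange(16, dtype=float).reshape(4, 4)
print(bilinear_sample(tex, 1.5, 1.5))  # midpoint of 5, 6, 9, 10 -> 7.5
```

A full warp applies this lookup at every output pixel, with the sampling coordinates produced from the predicted landmarks.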
Photo Editing with Face Selection and Replacement
This disclosure describes techniques to enable users to edit photos to include selected faces or facial expressions. A user can select a photo from a burst or other collection of photos. Detected faces in the selected photo are highlighted in a user interface that enables a user to select a face in the photo to modify. In response, a set of candidate faces that are suitable to replace the selected face are presented in the user interface. With user permission, the candidate faces can be obtained and/or modified from other accessible photos, such as from a burst of photos. The user can select any candidate face that seamlessly replaces the selected face in the displayed photo. The described interface allows users to quickly and easily replace undesired facial expressions in photos with preferred facial expressions.
Idempotent Generative Network
We propose a new approach for generative modeling based on training a neural
network to be idempotent. An idempotent operator is one that can be applied
sequentially without changing the result beyond the initial application, namely
f(f(z)) = f(z). The proposed model is trained to map a source distribution
(e.g., Gaussian noise) to a target distribution (e.g., realistic images) using
the following objectives: (1) Instances from the target distribution should map
to themselves, namely f(x) = x. We define the target manifold as the set of all
instances that f maps to themselves. (2) Instances from the source
distribution should map onto the defined target manifold. This is achieved by
optimizing the idempotence term, f(f(z)) = f(z), which encourages the range of
f to lie on the target manifold. Under ideal assumptions, such a process
provably converges to the target distribution. This strategy results in a model
capable of generating an output in one step, maintaining a consistent latent
space, while also allowing sequential applications for refinement.
Additionally, we find that by processing inputs from both target and source
distributions, the model adeptly projects corrupted or modified data back to
the target manifold. This work is a first step towards a "global projector"
that enables projecting any input into a target data distribution.
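The two objectives can be illustrated with the simplest idempotent operator, an orthogonal projection. The coordinate projection below is a toy stand-in for the trained network f (the dimensions and k = 2 are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def f(z, k=2):
    # Toy stand-in for the trained network: orthogonal projection onto the
    # first k coordinates. Projections are the canonical idempotent maps:
    # applying f twice changes nothing beyond the first application.
    out = z.copy()
    out[k:] = 0.0
    return out

rng = np.random.default_rng(0)
z = rng.normal(size=5)        # "source" sample, e.g. Gaussian noise
x = f(rng.normal(size=5))     # "target" sample, already on the manifold

assert np.allclose(f(x), x)        # objective (1): f(x) = x on the target
assert np.allclose(f(f(z)), f(z))  # objective (2): f(f(z)) = f(z)
```

In the actual model, f is a neural network and these two identities become the reconstruction and idempotence loss terms that training minimizes.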
MyStyle: A Personalized Generative Prior
We introduce MyStyle, a personalized deep generative prior trained with a few
shots of an individual. MyStyle allows one to reconstruct, enhance, and edit images
of a specific person, such that the output is faithful to the person's key
facial characteristics. Given a small reference set of portrait images of a
person (~100), we tune the weights of a pretrained StyleGAN face generator to
form a local, low-dimensional, personalized manifold in the latent space. We
show that this manifold constitutes a personalized region that spans latent
codes associated with diverse portrait images of the individual. Moreover, we
demonstrate that we obtain a personalized generative prior, and propose a
unified approach to apply it to various ill-posed image enhancement problems,
such as inpainting and super-resolution, as well as semantic editing. Using the
personalized generative prior we obtain outputs that exhibit high-fidelity to
the input images and are also faithful to the key facial characteristics of the
individual in the reference set. We demonstrate our method with fair-use images
of numerous widely recognizable individuals for whom we have the prior
knowledge for a qualitative evaluation of the expected outcome. We evaluate our
approach against few-shot baselines and show that our personalized prior,
quantitatively and qualitatively, outperforms state-of-the-art alternatives.
Project webpage: https://mystyle-personalized-prior.github.io/
Video: https://youtu.be/QvOdQR3tlO
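One way to picture the "local, low-dimensional, personalized manifold" is as combinations of per-photo anchor latents. The sketch below is a hedged toy: the 100x512 anchor matrix and the Dirichlet convex-combination sampler are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical anchor latents: one tuned latent code per reference portrait
# (~100 photos), each a 512-d vector in the generator's latent space.
anchors = rng.normal(size=(100, 512))

def sample_personalized(anchors, rng):
    # A point on the toy personalized manifold: a convex combination of the
    # anchors (Dirichlet weights are non-negative and sum to 1).
    alpha = rng.dirichlet(np.ones(len(anchors)))
    return alpha @ anchors

w = sample_personalized(anchors, rng)
print(w.shape)  # (512,)
```

Staying inside such a combination of anchors is what keeps generated outputs faithful to the individual's key facial characteristics, rather than drifting to arbitrary faces the pretrained generator can produce.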
Separating Signal from Noise using Patch Recurrence Across Scales
Recurrence of small clean image patches across different scales of a natural image has been successfully used for solving ill-posed problems in clean images (e.g., super-resolution from a single image). In this paper we show how this multi-scale property can be extended to solve ill-posed problems under noisy conditions, such as image denoising. While clean patches are obscured by severe noise in the original scale of a noisy image, noise levels drop dramatically at coarser image scales. This allows for the unknown hidden clean patches to "naturally emerge" in some coarser scale of the noisy image. We further show that patch recurrence across scales is strengthened when using directional pyramids (that blur and subsample only in one direction). Our statistical experiments show that for almost any noisy image patch (more than 99%), there exists a "good" clean version of itself at the same relative image coordinates in some coarser scale of the image. This is a strong phenomenon of noise-contaminated natural images, which can serve as a strong prior for separating the signal from the noise. Finally, incorporating this multi-scale prior into a simple denoising algorithm yields state-of-the-art denoising results.
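The claim that noise levels drop at coarser scales follows from averaging during downsampling. A quick numeric sanity check on pure noise (a toy 2x2 box-filter pyramid, not the paper's directional pyramids):

```python
import numpy as np

rng = np.random.default_rng(0)
noise = rng.normal(0.0, 1.0, size=(256, 256))   # pure-noise "image"

def downscale(img):
    # One coarser pyramid level: blur + subsample via 2x2 block averaging.
    return 0.25 * (img[::2, ::2] + img[1::2, ::2]
                   + img[::2, 1::2] + img[1::2, 1::2])

coarse = downscale(noise)
# Averaging 4 i.i.d. noise samples halves the noise standard deviation,
# while real image structure (absent in this toy) largely survives the blur.
print(round(noise.std(), 2), round(coarse.std(), 2))
```

Each additional pyramid level halves the noise std again, which is why clean patches can "naturally emerge" a few scales down.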